
    Modeling Human Visual Search in Natural Scenes: A Combined Bayesian Searcher and Saliency Map Approach

    Finding objects is essential for almost any daily-life visual task. Saliency models have been useful for predicting fixation locations in natural images during free exploration. However, predicting the sequence of fixations during visual search remains challenging. Bayesian observer models are particularly suited for this task because they represent visual search as an active sampling process. Nevertheless, how they adapt to natural images remains largely unexplored. Here, we propose a unified Bayesian model for visual search guided by saliency maps as prior information. We validated our model with a visual search experiment in natural scenes. We showed that, although state-of-the-art saliency models performed well in predicting the first two fixations in a visual search task (90% of the performance achieved by humans), their performance degraded to chance afterward. Therefore, saliency maps alone could model bottom-up first impressions, but they were not enough to explain scanpaths when top-down task information was critical. In contrast, our model led to human-like performance and scanpaths, as revealed by, first, the agreement between targets found by the model and by humans on a trial-by-trial basis, and second, the scanpath similarity between model and humans, which makes the model's behavior indistinguishable from that of humans. Altogether, the combination of deep neural network-based saliency models for image processing and a Bayesian framework for scanpath integration proves to be a powerful and flexible approach to modeling human behavior in natural scenarios.
    Authors: Gastón Elián Bujía, Melanie Sclar, Sebastián Alberto Vita, Guillermo Solovey, Juan Esteban Kamienkowski (CONICET; Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales; Argentina).
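The core idea — a posterior over target locations initialized from a saliency map and updated after each fixation — can be sketched minimally. This is an illustrative toy, not the paper's implementation: the greedy fixation rule, the Gaussian visibility function, and the inhibition factor are all assumptions for demonstration.

```python
import numpy as np

def gaussian_visibility(shape, fix, sigma=3.0):
    """How well each grid cell is seen from the current fixation (assumed Gaussian falloff)."""
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    return np.exp(-((ys - fix[0])**2 + (xs - fix[1])**2) / (2 * sigma**2))

def bayesian_search(saliency, target, n_fix=10, sigma=3.0):
    """Greedy ideal-observer search: fixate the posterior maximum at each step.

    The saliency map acts as the prior; after each fixation, well-inspected
    regions that did not contain the target are down-weighted.
    """
    post = saliency / saliency.sum()                  # saliency map as prior
    fix = np.unravel_index(post.argmax(), post.shape)
    scanpath = [fix]
    for _ in range(n_fix - 1):
        if fix == target:                             # target found
            break
        vis = gaussian_visibility(post.shape, fix, sigma)
        post = post * (1.0 - 0.9 * vis)               # inhibit inspected regions
        post /= post.sum()                            # renormalize posterior
        fix = np.unravel_index(post.argmax(), post.shape)
        scanpath.append(fix)
    return scanpath
```

With a uniform map plus one salient bump, the first fixation lands on the bump; if the target is elsewhere, inhibition pushes subsequent fixations outward.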

    BotPercent: Estimating Bot Populations in Twitter Communities

    Twitter bot detection is vital for combating misinformation and safeguarding the integrity of social media discourse. While malicious bots are becoming increasingly sophisticated and personalized, standard bot detection approaches are still agnostic to the social environments (henceforth, communities) the bots operate in. In this work, we introduce community-specific bot detection, estimating the percentage of bots given the context of a community. Our method, BotPercent, is an amalgamation of Twitter bot detection datasets and feature-, text-, and graph-based models, adjusted to a particular community on Twitter. We introduce an approach that performs confidence calibration across bot detection models, which addresses generalization issues in existing community-agnostic models targeting individual bots and leads to more accurate community-level bot estimations. Experiments demonstrate that BotPercent achieves state-of-the-art performance in community-level Twitter bot detection across both balanced and imbalanced class distribution settings, outperforming existing approaches and presenting a less biased estimator of Twitter bot populations within the communities we analyze. We then analyze bot rates in several Twitter groups, including users who engage with partisan news media, political communities in different countries, and more. Our results reveal that the presence of Twitter bots is not homogeneous but exhibits a spatial-temporal distribution with considerable heterogeneity that should be taken into account for content moderation and social media policymaking. The implementation of BotPercent is available at https://github.com/TamSiuhin/BotPercent.
    Comment: Accepted to Findings of EMNLP 202
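Confidence calibration before averaging detector outputs is the key step described above. A minimal sketch of one common calibration technique, temperature scaling fit on held-out labels, is below; the function names and grid-search approach are illustrative assumptions, not BotPercent's actual code.

```python
import numpy as np

def temperature_nll(logits, labels, T):
    """Negative log-likelihood of binary labels under temperature-scaled logits."""
    p = 1.0 / (1.0 + np.exp(-logits / T))
    eps = 1e-12                       # avoid log(0)
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

def fit_temperature(logits, labels, grid=np.linspace(0.25, 4.0, 76)):
    """Pick the temperature minimizing held-out NLL (simple grid search)."""
    return min(grid, key=lambda T: temperature_nll(logits, labels, T))

def community_bot_rate(detector_logits, temperatures):
    """Average calibrated bot probabilities across detectors and accounts
    to estimate the fraction of bots in a community."""
    probs = [1.0 / (1.0 + np.exp(-l / T))
             for l, T in zip(detector_logits, temperatures)]
    return float(np.mean(probs))
```

Calibrating each detector separately before averaging keeps an overconfident model from dominating the community-level estimate.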

    FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions

    Theory of mind (ToM) evaluations currently focus on testing models using passive narratives that inherently lack interactivity. We introduce FANToM, a new benchmark designed to stress-test ToM within information-asymmetric conversational contexts via question answering. Our benchmark draws upon important theoretical requisites from psychology and necessary empirical considerations for evaluating large language models (LLMs). In particular, we formulate multiple types of questions that demand the same underlying reasoning, in order to identify an illusory or false sense of ToM capability in LLMs. We show that FANToM is challenging for state-of-the-art LLMs, which perform significantly worse than humans even with chain-of-thought reasoning or fine-tuning.
    Comment: EMNLP 2023. Code and dataset can be found here: https://hyunw.kim/fanto
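The information asymmetry at the heart of such a benchmark can be made concrete with a toy sketch: track which participants were present when each fact was mentioned, yielding ground-truth answers to "does X know Y?" questions. This is our own illustration of the general idea, not FANToM's data format.

```python
def build_knowledge(events):
    """events: list of (speaker, fact, present_set) tuples from a conversation.

    Returns a mapping fact -> set of participants who know it (the speaker
    plus everyone present when it was said). Absent participants do not
    learn the fact, creating the information asymmetry to question models on.
    """
    knows = {}
    for speaker, fact, present in events:
        knows.setdefault(fact, set()).update(present | {speaker})
    return knows
```

Questions generated from such a structure have verifiable answers: a model claiming an absent participant knows the fact exhibits a false sense of belief tracking.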

    Faith and Fate: Limits of Transformers on Compositionality

    Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This raises the question: are these errors incidental, or do they signal more substantial limitations? In an attempt to demystify Transformers, we investigate the limits of these models across three representative compositional tasks -- multi-digit multiplication, logic grid puzzles, and a classic dynamic programming problem. These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer. We formulate compositional tasks as computation graphs to systematically quantify the level of complexity, and break down reasoning steps into intermediate sub-procedures. Our empirical findings suggest that Transformers solve compositional tasks by reducing multi-step compositional reasoning to linearized subgraph matching, without necessarily developing systematic problem-solving skills. To round off our empirical study, we provide theoretical arguments on abstract multi-step reasoning problems that highlight how Transformers' performance rapidly decays with increased task complexity.
    Comment: 10 pages + appendix (21 pages
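The "computation graph" framing can be illustrated by counting the primitive sub-steps a compositional task requires. The sketch below uses multi-digit multiplication as an example; the counting scheme (one node per single-digit product, plus per-digit additions for accumulating partial products) is a simplified proxy of our own, not the paper's exact formulation.

```python
def multiplication_graph_size(a: int, b: int) -> int:
    """Rough count of primitive sub-steps in schoolbook multiplication of a * b.

    Each pair of digits contributes one single-digit product node; each
    partial-product row after the first is folded into the running total
    with roughly one addition per result digit. The count grows quickly
    with operand length, quantifying compositional complexity.
    """
    da, db = str(a), str(b)
    products = len(da) * len(db)                    # single-digit product nodes
    additions = (len(db) - 1) * (len(da) + len(db)) # digit-wise accumulation
    return products + additions
```

Under this proxy, 2x2-digit multiplication needs 8 sub-steps while 3x3-digit needs 21, matching the intuition that graph size, and hence difficulty, grows superlinearly with operand length.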